Error diagnosis and classification of errors in two Hebrew state-of-the-art automatic speech recognition systems
Authors
Abstract
In this research we diagnose two commercial automatic speech recognizers (ASRs) on a corpus of academic lectures in Hebrew. Our goal is not only to measure the engines' performance but also to find out whether current Hebrew ASR transcription can be a reasonable replacement for human transcription, or at least a significant bootstrap for manual post-processing of the automatic output. We performed a word error rate (WER) diagnosis and a linguistic error classification on two automatic transcriptions, Nuance's and Google's, and compared them to a real-time (RT) stenographer's records, as well as to an exact transcription that reflects the speaker's speech exactly. Results show that the ASRs' WER is caused by massive substitutions, while the RT transcription's errors were caused mainly by deletions. This research provides an opportunity to explore cost/benefit aspects of automatic vs. manual audio transcription.
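The WER diagnosis described above separates errors into substitutions, deletions, and insertions. The following is a minimal sketch of the standard WER definition, (S + D + I) / N over a word-level edit-distance alignment, and not the authors' actual tooling. The function name `word_error_rate`, the `WerResult` type, and the English sample sentences are illustrative assumptions, not data from the study.

```python
from typing import NamedTuple


class WerResult(NamedTuple):
    substitutions: int
    deletions: int
    insertions: int
    wer: float


def word_error_rate(reference: str, hypothesis: str) -> WerResult:
    """Align hypothesis to reference by word-level edit distance and count error types."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    rows, cols = len(ref) + 1, len(hyp) + 1
    # Each cell holds (total edits, substitutions, deletions, insertions)
    # for the best alignment of ref[:i] with hyp[:j].
    table = [[(0, 0, 0, 0)] * cols for _ in range(rows)]
    for i in range(1, rows):
        table[i][0] = (i, 0, i, 0)  # every reference word deleted
    for j in range(1, cols):
        table[0][j] = (j, 0, 0, j)  # every hypothesis word inserted

    for i in range(1, rows):
        for j in range(1, cols):
            if ref[i - 1] == hyp[j - 1]:
                table[i][j] = table[i - 1][j - 1]  # match, no cost
                continue
            t, s, d, n = table[i - 1][j - 1]
            substitution = (t + 1, s + 1, d, n)
            t, s, d, n = table[i - 1][j]
            deletion = (t + 1, s, d + 1, n)
            t, s, d, n = table[i][j - 1]
            insertion = (t + 1, s, d, n + 1)
            # Pick the cheapest option by total edit count.
            table[i][j] = min(substitution, deletion, insertion, key=lambda c: c[0])

    total, subs, dels, ins = table[-1][-1]
    return WerResult(subs, dels, ins, total / len(ref))


# Illustrative input: one substituted word ("sat" -> "sit") and one dropped word ("the").
result = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(result)
```

A breakdown of this kind is what makes the contrast in the abstract visible: a substitution-heavy profile and a deletion-heavy profile can produce similar WER values while calling for different post-editing effort.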
Similar resources
Designing and implementing a system for Automatic recognition of Persian letters by Lip-reading using image processing methods
For many years, speech has been the most natural and efficient means of information exchange for human beings. With the advancement of technology and the prevalence of computer usage, the design and production of speech recognition systems have attracted researchers' attention. Among these, lip-reading techniques encounter many challenges for speech recognition, that one of the challenges b...
Can automatic speech recognition be satisficing for audio/video search? Keyword-focused analysis of Hebrew automatic and manual transcription
With massive amounts of academic audio and video content over the web, it is important to assess the performance of state-of-the-art automatic speech recognition (ASR) systems for audio/video navigation through search queries. This paper suggests a novel perspective of the challenges of ASR: instead of minimizing word error rates (WER), focus on keyword recognition. Focusing on keywords may be ...
Recent Improvements on Error Detection for Automatic Speech Recognition
Automatic speech recognition (ASR) offers the ability to access the semantic content present in spoken language within audio and video documents. While acoustic models based on deep neural networks have recently significantly improved the performance of ASR systems, automatic transcriptions still contain errors. Errors perturb the exploitation of these ASR outputs by introducing noise to the te...
A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making man-machine interaction more natural. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
A CAD System Framework for the Automatic Diagnosis and Annotation of Histological and Bone Marrow Images
Due to the ever-increasing volume of medical image data in the world's medical centers and recent developments in medical imaging hardware and technology, software analysis of medical data has become necessary. Equipping medical science with intelligent tools for the diagnosis and treatment of illnesses has reduced physicians' errors and physical and financial damages. In this article we pr...